203 research outputs found

    Writer identification using curvature-free features

    Feature engineering plays an important role in writer identification, which has been widely studied in the literature. Previous work has shown that the joint feature distribution of two properties can improve performance. A joint feature distribution makes feature relationships explicit instead of hoping that a trained classifier picks up a non-linear relation present in the data. In this paper, we propose two novel curvature-free features for writer identification: run-lengths of local binary patterns (LBPruns) and the cloud of line distribution (COLD). LBPruns is the joint distribution of the traditional run-length and local binary pattern (LBP) methods; it computes the run-lengths of local binary patterns on both binarized and gray-scale images. The COLD feature is the joint distribution of the orientation and length of line segments obtained from writing contours in handwritten documents. The proposed LBPruns and COLD are texture-based, curvature-free features that capture the line information of handwritten text instead of the curvature information. The combination of the LBPruns and COLD features provides a significant improvement on the CERUG data set, whose handwritten documents contain a large number of irregular-curvature strokes. Evaluation of the proposed features on two other widely used data sets (Firemaker and IAM) shows promising results.

    Multi-Layer Support Vector Machines

    Separability versus prototypicality in handwritten word-image retrieval

    Hit lists are at the core of retrieval systems. The top ranks are important, especially if user feedback is used to train the system. Analysis of hit lists revealed counter-intuitive instances in the top ranks for good classifiers. In this study, we propose that two functions need to be optimised: (a) separability, which is required to reduce a massive set of instances to a likely subset among ten thousand or more classes, and (b) prototypicality of instances, so that the results are intuitive after ranking. By optimising these requirements sequentially, the number of distracting images is strongly reduced, followed by nearest-centroid based instance ranking that retains an intuitive (low-edit distance) ranking. We show that in handwritten word-image retrieval, precision improvements of up to 35 percentage points can be achieved, yielding up to 100% top hit precision and 99% top-7 precision in data sets with 84 000 instances, while maintaining high recall performance. The method is conveniently implemented in a massive-scale, continuously trainable retrieval engine, Monk. (C) 2013 Elsevier Ltd. All rights reserved.

    Separability versus Prototypicality in Handwritten Word Retrieval

    User appreciation of a word-image retrieval system is based on the quality of a hit list for a query. Using support vector machines for ranking in large scale, handwritten document collections, we observed that many hit lists suffered from bad instances in the top ranks. An analysis of this problem revealed that two functions needed to be optimised concerning both separability and prototypicality. By ranking images in two stages, the number of distracting images is reduced, making the method very convenient for massive scale, continuously trainable retrieval engines. Instead of cumbersome SVM training, we present a nearest-centroid method and show that precision improvements of up to 35 percentage points can be achieved, yielding up to 100% precision in data sets with a large amount of instances, while maintaining high recall performances.